ABSTRACT Microarrays containing 1046 human cDNAs
of unknown sequence were printed on glass with high-speed
robotics. These 1.0-cm² DNA "chips" were used to quantitatively
monitor differential expression of the cognate human
genes using a highly sensitive two-color hybridization assay. 
Array elements that displayed differential expression patterns 
under given experimental conditions were characterized by 
sequencing. The identification of known and novel heat shock 
and phorbol ester-regulated genes in human T cells demon- 
strates the sensitivity of the assay. Parallel gene analysis with 
microarrays provides a rapid and efficient method for large- 
scale human gene discovery. 



Biology has entered the genome era (1). Complete genome 
sequences for all of the model organisms and human will 
probably be available by the year 2003 (2). Torrents of human 
expressed sequence tags (ESTs) provide a starting point for 
elucidating the function of tens of thousands of cognate genes 
(3). Genome analysis will provide insights into growth, devel- 
opment, differentiation, homeostasis, aging, and the onset of 
diseases (1-3). A detailed understanding of the human genome 
will require the implementation of sophisticated methods for 
gene expression analysis and gene discovery. 

Recently, a microarray-based method for high-throughput 
monitoring of plant gene expression was described (4). This 
"chip"-based approach involved using microarrays of cDNA 
clones as gene-specific hybridization targets to quantitatively 
measure expression of the corresponding plant genes (4, 5). A 
two-color fluorescence labeling and detection scheme facilitated
sensitive differential expression analysis of different
plant tissues (4, 5). The efficiency of this approach for studies 
in higher plants suggested the use of this method for human 
genome analysis (4-7). Here, we report the use of cDNA 
microarrays for human gene expression monitoring, biological 
investigation, and gene discovery. 

MATERIALS AND METHODS 

Human cDNA Clones. The cDNA library was made with
mRNA from human peripheral blood lymphocytes transformed
with the Epstein-Barr virus. Inserts >600 bp were
cloned into the lambda vector λYES-R to generate 10^7-10^8
recombinants. Bacterial transformants were obtained by infecting
E. coli strain JM107/λKC. Colonies were picked at
random and propagated in a 96-well format, and minilysate
DNA was prepared by alkaline lysis using REAL preps
(Qiagen, Chatsworth, CA). Inserts were amplified by PCR in
a 96-well format using primers (PAN132, 5'-CCTCTATACTTTAACGTCAAGG;
and PAN133, 5'-TTGTGTGGAATTGTGAGCGG) complementary to the λYES
polylinker and containing a six-carbon amino modification
(Glen Research, Sterling, VA) on the 5' end. PCR products
were purified in a 96-well format using QIAquick columns
(Qiagen).

Microarray Preparation. Amino-modified PCR products
were suspended at a concentration of 0.5 mg/ml in 3x
standard saline citrate (SSC) and arrayed from 96-well microtiter
plates onto silylated microscope slides (CEL Associates,
Houston) using high-speed robotics (4-7). A total of 1056
cDNAs, representing 1046 human clones and 10 Arabidopsis
controls, were arrayed in 1.0-cm² areas. Printed arrays were
incubated for 4 hr in a humid chamber to allow rehydration of
the array elements and rinsed once in 0.2% SDS for 1 min,
twice in H2O for 1 min, and once for 5 min in sodium
borohydride solution (1.0 g of NaBH4 dissolved in 300 ml of
PBS and 100 ml of 100% ethanol). The arrays were submerged
in H2O for 2 min at 95°C, transferred quickly into 0.2% SDS
for 1 min, rinsed twice in H2O, air dried, and stored in the dark
at 25°C.

Fluorescent Probes. Tissue mRNAs were purchased
(CLONTECH). Jurkat mRNA was isolated as described by
Schena et al. (4). Probes were made as described (4) with
several modifications. The reverse transcriptase used here was
Superscript II RNase H- (GIBCO). The Cy5-dCTP was
purchased from Amersham. Each reverse transcription reaction
contained 3.0 µg of total human mRNA. Arabidopsis
control mRNAs were made by in vitro transcription of cloned
HAT4, HAT22, and YesAt-23 cDNAs (4, 8, 9) using an RNA
Transcription Kit (Stratagene). For quantitation, the mRNAs
were doped into the reverse transcription reaction at ratios of
1:100,000, 1:10,000, and 1:1000 (wt/wt), respectively. Following
the reverse transcription step, samples were treated with 2.5 µl
of 1 M sodium hydroxide for 10 min at 37°C, then neutralized
by adding 2.5 µl of 1 M Tris-HCl (pH 6.8) and 2.0 µl of 1 M
HCl. Probe mixtures contained cDNA products derived from
3 µg of total mRNA, suspended in 5.0 µl of hybridization
buffer (5x SSC plus 0.2% SDS).
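
The doping ratios correspond to very small masses of control mRNA. As a rough sketch only, assuming each ratio is the mass of control mRNA relative to the 3.0 µg of total human mRNA in one reaction, and assuming the three controls pair with the three ratios in the order listed, the amounts can be worked out as follows. The function and variable names are illustrative and do not come from the original protocol.

```python
# Sketch: control mRNA mass per reverse transcription reaction.
# Assumption: ratios are control mass / total human mRNA mass (wt/wt).
TOTAL_MRNA_UG = 3.0  # total human mRNA per reaction, as stated in the text

# Assumption: controls pair with ratios in the order listed ("respectively").
DOPING_RATIOS = {
    "HAT4": 1 / 100_000,
    "HAT22": 1 / 10_000,
    "YesAt-23": 1 / 1_000,
}


def control_mass_pg(total_mrna_ug: float, ratio: float) -> float:
    """Return the control mRNA mass in picograms for a wt/wt doping ratio."""
    return total_mrna_ug * 1e6 * ratio  # 1 microgram = 1e6 picograms


for name, ratio in DOPING_RATIOS.items():
    print(f"{name}: {control_mass_pg(TOTAL_MRNA_UG, ratio):.0f} pg")
```

Under these assumptions the most dilute control amounts to about 30 pg per reaction, which gives a sense of the scale the assay is calibrated against.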

Hybridization and Scanning. Probes were hybridized to
1.0-cm² microarrays under a 14 x 14 mm glass coverslip for
6-12 hr at 60°C in a custom-built hybridization chamber (4-7).
Arrays were washed for 5 min at room temperature (25°C) in
low stringency wash buffer (1x SSC/0.2% SDS), then for 10
min at room temperature in high stringency wash buffer (0.1x
SSC/0.2% SDS). Arrays were scanned in 0.1x SSC using a
fluorescence laser scanning device (4-7), fitted with a custom 
filter set (Chroma Technology, Brattleboro, VT). Accurate 
differential expression measurements (i.e., final fluorescence 
ratios) were obtained by taking the average of the ratios of two 
independent hybridizations. 
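
The averaging step can be illustrated with a short sketch. It assumes the "average" is an arithmetic mean and that, when the two dyes are exchanged between hybridizations, each ratio is first oriented as treated/control. The signal values are hypothetical, and the function names are illustrative rather than part of the software used in this study.

```python
def oriented_ratio(fluorescein: float, cy5: float, treated_dye: str) -> float:
    """Return the treated/control signal ratio for one array element."""
    if treated_dye == "cy5":
        return cy5 / fluorescein
    if treated_dye == "fluorescein":
        return fluorescein / cy5
    raise ValueError(f"unknown dye: {treated_dye!r}")


def final_ratio(hybridizations) -> float:
    """Arithmetic mean of the oriented ratios of independent hybridizations."""
    ratios = [oriented_ratio(f, c, dye) for f, c, dye in hybridizations]
    return sum(ratios) / len(ratios)


# Hypothetical signals for one element (arbitrary units):
# (fluorescein signal, Cy5 signal, dye carried by the treated sample)
example = [
    (100.0, 210.0, "cy5"),          # treated sample labeled with Cy5
    (190.0, 100.0, "fluorescein"),  # labels exchanged in the second run
]
print(round(final_ratio(example), 2))
```

Orienting each ratio before averaging means that a dye-specific bias pushes the two measurements in opposite directions, so it tends to cancel in the mean.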



Abbreviation: EST, expressed sequence tag. 
Cell Culture. Jurkat cells were grown in a tissue culture
incubator (37°C and 5% CO2) in RPMI medium supplemented
with 10% fetal bovine serum, 100 µg of streptomycin per ml,
and 500 units of penicillin per ml. Heat shock corresponded to
a 4-hr incubation at 43°C. Phorbol ester-treated cells were
grown for 4 hr in the presence of 50 ng of phorbol 12-myristate
13-acetate (PMA) per ml.

RNA Blotting. Dot blots were performed as described (4). 

DNA Sequencing. Sequences were obtained using the 
PAN132 and PAN133 primers and a 373A automated sequencer,
according to the instructions of the manufacturer
(Applied Biosystems). 

Computer Graphics and Informatics. Pseudocolor represen- 
tations of fluorescent images were made with National Institutes 
of Health image software (version 1.52). Software for differential 
expression representations was purchased from Imaging Re- 
search (St. Catherine's, ON, Canada). Sequence searches were 
made to the nonredundant nucleotide data base at the National 
Center for Biotechnology Information (NCBI) using Macintosh 
blast software. The EST data base was accessed via the World
Wide Web (http://www.ncbi.nlm.nih.gov/).

RESULTS 

Gene Discovery and the Heat Shock Response. Microarrays 
were used to examine the heat shock response in cultured 
human T (Jurkat) cells. Control (37°C) and heat-treated 
(43°C) cells were harvested and lysed, and total mRNA from 
the two cell samples was labeled by reverse transcriptase 
incorporation of fluorescein- and Cy5-dCTP, respectively. In
a second set of labeling reactions, the fluorescent groups were
"swapped" such that the control and heat-treated
samples were labeled with Cy5- and fluorescein-dCTP, respectively.
Each pair of fluorescent probes was hybridized to a
1056-element microarray. The arrays were washed at high 
stringency and scanned with a confocal laser scanning device 
to detect emission of the two fluorescent groups. 

Hybridization signals were observed to >95% of the human 
cDNA array elements, but not to any of the Arabidopsis 
negative controls (Fig. 1). Fluorescence intensities spanned 
more than three orders of magnitude for the 1046 array 
elements surveyed (Fig. 1). Comparative expression analysis of 
heat shocked versus control cells in the two experiments 
revealed 17 array elements that displayed altered fluorescence
ratios of >2.0-fold (Figs. 1 and 2A). Of the 17 putative
differentially expressed genes, 11 were induced by heat shock 
treatment and 6 displayed modest repression (Figs. 1 and 2A). 
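
The selection rule can be sketched as a simple threshold on the treated/control ratio. This is an illustration under stated assumptions, not the software used in the study: the cutoff is treated as inclusive because Table 1 lists ratios of exactly 2.0 and 0.5, and the third entry below is a hypothetical unchanged element.

```python
def classify(ratio: float, threshold: float = 2.0) -> str:
    """Label an array element by its treated/control fluorescence ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if ratio >= threshold:
        return "induced"
    if ratio <= 1.0 / threshold:
        return "repressed"
    return "unchanged"


# Ratios for B16 and B2 are taken from Table 1; the last entry is hypothetical.
ratios = {
    "B16 (HSP90 alpha)": 5.8,
    "B2 (beta-actin)": 0.5,
    "hypothetical element": 1.2,
}
for name, ratio in ratios.items():
    print(name, classify(ratio))
```

A symmetric cutoff (2.0 for induction, 1/2.0 for repression) treats increases and decreases of the same magnitude alike.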

To determine the identity of the heat-regulated genes,
cDNAs corresponding to each of the 17 array elements were
sequenced on the proximal and distal ends. Data base searches
revealed perfect matches for 14 of the 17 clones, and in each
case proximal and distal cDNA sequences mapped to the same
gene (Table 1). Of the 1046 human genes examined on the
microarray, the five most highly induced in heat-treated cells
were heat shock protein 90α (hsp90α), dnaJ, hsp90β, polyubiquitin,
and t-complex polypeptide 1 (tcp-1) (Table 1). Three
of the 17 clones did not match any entry in the public data base,
though one of the clones (B7) exhibited significant homology
to an EST from Caenorhabditis elegans (Table 1). Each of the
novel sequences (B7-B9) exhibited ~2-fold induction (Table 1)
and relatively low-level expression (Table 2).

To confirm the microarray results, mRNA levels for each of 
the genes were measured by RNA blotting. Each of the genes 
that displayed heat shock induction, including the three novel 
Table 1. Microarray elements corresponding to differentially expressed genes

Clone  Row  Column  Ratio  Blast identity     Accession no.
B1     24   21      0.5    CYC oxidase III    J01415, J01415
B2     1    31      0.5    β-Actin            NR, X00351
B3     15   8       0.5    CYC oxidase III    J01415, J01415
B4     32   19      0.5    CYC oxidase III    J01415, J01415
B5     17   8       0.5    CYC oxidase III    J01415, J01415
B6     22   31      0.5    β-Actin            NR, X00351
B7*    5    4       2.0    Novel†             U56653, U56654
B8     2    19      2.0    Novel†             U56655, U56656
B9     14   5       2.2    Novel†             U56657, U56658
B10    7    8       2.4    Polyubiquitin      X04803, X04803
B11    12   2       2.4    TCP-1              X52882, X52882
B12    28   2       2.5    Polyubiquitin      M17597, M17597
B13    14   7       2.5    Polyubiquitin      X04803, X04803
B14    20   9       2.6    HSP90β             M16660, M16660
B15    30   12      4.0    DnaJ homolog       D13388, D13388
B16    10   5       5.8    HSP90α             X07270, X07270
B17    13   16      6.3    HSP90α             M27024, X15183
B18    7    19      2.0    β2-microglobulin   S54761, M30683
B19    21   30      2.1    Novel†             U56659, U56660
B20    3    26      2.2    β2-microglobulin   S54761, M30683
B21    1    18      2.6    PGK                M11968, L00160
B22    22   30      3.5    NF-κB1             Z47744, M55643
B23    20   16      19     PAC-1              L11329, L11329

[...] array position (Fig. 1), fluorescence ratio, sequence identity, and accession number of cDNAs that manifested
[...] heat shock (B1-17) or phorbol ester treatment (B18-23).
[...] over 300 nucleotides were assumed to be identical to known sequences. All genes are nuclear
[...] (mitochondrial). Accession numbers reflect the highest score for proximal and distal sequences.
[...] TCP-1, T-complex polypeptide; HSP, heat shock protein; PGK, phosphoglycerate kinase;
NF-κB1, nuclear factor-kappa B1; PAC-1, phosphatase of activated cells; and NR, trace not readable due to the [...].

*B7 is 67% identical to an EST from C. elegans (D76026).
†No match in the public data bases.
Table 2. Human gene expression monitored by microarray and RNA blot analyses

                                        Expression level, per 10^5 mRNAs
Clone  Blast identity                   Microarray    Ratio   RNA blot    Ratio
B1     CYC oxidase III                  92/46         0.5     100/80      0.8
B2     β-Actin                          240/120       0.5     270/280     1.0
B3     CYC oxidase III                  36/18         0.5     ND          ND
B4     CYC oxidase III                  76/38         0.5     ND          ND
B5     CYC oxidase III                  62/31         0.5     ND          ND
B6     β-Actin                          180/89        0.5     ND          ND
B7     Novel (weakly to D76026)         1.3/2.6       2.0     0.77/1.8    2.3
B8     Novel                            2.0/4.0       2.0     1.5/3.4     2.3
B9     Novel                            0.8/1.8       2.2     1.2/1.8     1.5
B10    Polyubiquitin                    0.8/1.9       2.4     25/89       3.6
B11    TCP-1                            2.3/5.5       2.4     7.1/27      3.8
B12    Polyubiquitin                    0.8/2.0       2.5     ND          ND
B13    Polyubiquitin                    1.7/4.3       2.5     ND          ND
B14    HSP90β                           [illegible]   2.6     30/120      4.0
B15    DnaJ homolog                     1.0/4.0       4.0     1.6/13      8.1
B16    HSP90α                           0.6/3.5       5.8     3.2/29      9.1
B17    HSP90α                           0.8/5.0       6.3     8.6/62      7.2
B18    β2-microglobulin                 1.0/2.0       2.0     5.4/15      2.8
B19    Novel                            1.2/2.5       2.1     4.5/9.5     2.1
B20    β2-microglobulin                 2.7/5.9       2.2     ND          ND
B21    Phosphoglycerate kinase          2.4/6.2       2.6     4.7/9.2     2.0
B22    NF-κB1                           1.7/6.0       3.5     0.65/4.7    7.2
B23    PAC-1                            0.5/9.5       19      0.21/15     71

Shown are expression levels per 100,000 mRNAs (wt/wt) of genes assayed with a microarray (Fig. 1)
or RNA blot. Ratios correspond to values from cells subjected to heat shock (B1-17) or phorbol ester
treatment (B18-23) relative to untreated cells. Clone and gene names are given in Table 1. ND, not
determined.



sequences, exhibited elevated mRNA levels by dot blot analysis 
(Table 2). In all cases, expression ratios as determined by the 
two procedures differed by <2-fold for the genes identified in 
the heat shock experiments (Table 2). The two assays differed 
more widely in terms of assessing absolute expression levels; 
nonetheless, absolute expression as monitored on a microarray 
typically correlated with RNA blots to within a factor of five 
(Table 2). 
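
Agreement between the two assays can be expressed as the fold difference between their ratios. The sketch below uses the microarray and RNA blot ratios listed for clones B1 and B16 in Table 2. The function name is illustrative, and the symmetric definition (the larger ratio divided by the smaller) is an assumption about how "differed by <2-fold" was evaluated.

```python
def fold_difference(a: float, b: float) -> float:
    """Symmetric fold difference between two positive ratios."""
    if a <= 0 or b <= 0:
        raise ValueError("ratios must be positive")
    return max(a / b, b / a)


# (microarray ratio, RNA blot ratio) as listed in Table 2.
pairs = {
    "B1": (0.5, 0.8),
    "B16": (5.8, 9.1),
}
for clone, (array_ratio, blot_ratio) in pairs.items():
    print(clone, round(fold_difference(array_ratio, blot_ratio), 2))
```

The same function applies to the absolute expression levels, where the text reports agreement only to within a factor of about five.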

Phorbol Ester Signaling. To explore a signaling pathway
distinct from the heat shock response, microarrays were used
to examine the cellular effects of phorbol ester treatment.
Jurkat cells were treated with phorbol ester, harvested, lysed,
and used as a source of mRNA. Samples of mRNA from
untreated or phorbol ester-stimulated cells were labeled with
reverse transcriptase. The probes were mixed, hybridized to
microarrays, and scanned for fluorescence emission of the two
fluorescent groups. A total of six array elements displayed
≥2.0-fold elevated signals with probes from phorbol ester-treated
cells relative to control samples (Fig. 1B).

To determine the identity of the phorbol ester-induced
genes, clones corresponding to the six array elements were
sequenced. Data base searches revealed perfect matches for
five of the six sequences (Table 1). The two most highly
induced genes were the PAC-1 tyrosine phosphatase and
nuclear factor-kappa B1 (NF-κB1); modest activation was
observed for phosphoglycerate kinase and β2-microglobulin
(Table 1). One remaining clone (B19) did not match any entry
in the public data base (Table 1). B19 displayed a 2.1-fold
induction and, similar to the novel heat shock genes, a relatively
low absolute expression level (Tables 1 and 2). All six of
the phorbol ester-inducible genes displayed increased steady-state
mRNA levels by RNA blotting (Table 2). PAC-1 expression
(Fig. 1; Table 2) defined a detection limit of ~1:500,000
for the assay.

Transcript Imaging in Human Tissues. To determine
whether microarrays could be used to monitor expression in
human tissues, probes were prepared from human bone marrow,
brain, prostate, and heart by labeling each mRNA sample
with Cy5-dCTP. In a separate reaction, a control probe was
prepared by labeling Jurkat mRNA with fluorescein-dCTP.
The four Cy5-labeled probes were each mixed with an aliquot
of the fluorescein-labeled control sample, and the four mixtures
were hybridized to separate microarrays. The arrays were
washed and scanned for fluorescence emission, and hybridization
signals for each of the tissue samples were normalized
to the Jurkat control to generate an expression profile for each
of the 1046 clones present on the array.
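
The text does not give the normalization formula. One plausible reading, shown here purely as an assumption, is that the tissue signal for an element is divided by the Jurkat control signal on the same array and then scaled by that gene's known expression level in Jurkat cells. All names and numbers in the sketch are hypothetical.

```python
def tissue_level(tissue_cy5: float, jurkat_fluorescein: float,
                 jurkat_level_per_1e5: float) -> float:
    """Estimate tissue expression per 100,000 mRNAs from a two-color signal.

    Assumed model (not stated in the paper):
        level = (Cy5 tissue signal / fluorescein Jurkat signal) * Jurkat level
    """
    if jurkat_fluorescein <= 0:
        raise ValueError("control signal must be positive")
    return (tissue_cy5 / jurkat_fluorescein) * jurkat_level_per_1e5


# Hypothetical element: tissue signal twice the Jurkat control signal,
# with an assumed Jurkat level of 1.0 per 100,000 mRNAs.
print(tissue_level(tissue_cy5=400.0, jurkat_fluorescein=200.0,
                   jurkat_level_per_1e5=1.0))
```

Using one common control on every array makes the four tissue profiles comparable with each other, because each is expressed relative to the same reference sample.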

Detectable expression was observed for all 15 of the heat
shock and phorbol ester-regulated genes in the four tissue
types examined (Fig. 3). In general, the expression level of each
gene in Jurkat cells correlated rather closely with expression in
the four tissues (Table 2; Fig. 3). Genes encoding β-actin and
cytochrome c oxidase, the two most highly expressed of the 15
genes in Jurkat cells (Table 2), were highly expressed in bone
marrow, brain, prostate, and heart (Fig. 14). Expression of
cytochrome c oxidase, hsp90α, and the novel B7 sequence was
significantly greater in heart than in the other tissues (Fig. 3).

DISCUSSION 

Many of the heat shock genes identified in this study encode 
factors that function either as molecular "chaperones" 
(HSP90α, HSP90β, DnaJ, TCP-1) or as mediators of protein
degradation (polyubiquitin). The identification of these se- 
quences is consistent with the biochemical basis of heat shock 
induction (10-15). Proteins undergo denaturation at elevated 
temperatures, and those that fail to maintain proper confor- 
mation must be selectively degraded (10-15). It will be inter- 
esting to determine whether the three novel heat shock- 
inducible sequences (B7-B9) mediate protein folding and 
turnover or possess some other biochemical activity. Complete 
nucleotide sequence determination, conceptual translation, 
expression monitoring, and biochemical analysis should pro- 
vide a detailed functional understanding of these genes. 



10618 Biochemistry: Schena et al.




Fig. 3. Transcript profiles of heat shock and phorbol ester-regulated
genes. Gene expression levels per 100,000 mRNAs (x-axes)
are shown for 15 genes (Table 1) in human bone marrow (red), brain
(green), prostate (blue), and heart (yellow). Genes are grouped
according to expression levels (A-C).
Phorbol ester, a potent activator of protein kinase C (16, 17),
induced a set of genes distinct from those involved in the heat
shock pathway. The most highly induced gene identified in this
study, PAC-1, encodes a nuclear tyrosine phosphatase that may play
a role in regulating transcription and cell cycle progression
(18). NF-κB1, a second phorbol ester-inducible gene, is an
intensively studied member of the Rel transcription factor
family (19-21). The Rel proteins are activated by a large
number of stimuli, including phorbol esters, cytokines, bacterial
and viral pathogens, and ultraviolet light (19-21). Modest
activation was observed for three sequences not known to be
inducible by phorbol esters, including phosphoglycerate kinase,
β2-microglobulin, and a novel human gene (B19). Extensive
expression monitoring with microarrays should assist in
understanding how each of these genes integrates into the
highly complex phorbol ester signaling pathway.

It is striking that four novel human genes were discovered 
with an array of 1000 randomly chosen clones, particularly 
because the heat shock and phorbol ester signaling pathways 
have been so intensively studied (10-21). The facile discovery 
of these sequences underscores the fact that microarrays can 
be used for gene discovery in the absence of any sequence 
information. By this approach, clones are chosen at random 
from any library of interest and only those clones that display 
interesting expression patterns are sequenced and character- 
ized. This parallel assay, coupled with a modest DNA sequenc- 
ing facility, allows high-throughput human genome expression 
analysis and gene discovery. 

Genes that are activated or repressed by a given stimulus 
provide functional clues to the cellular pathway involved 
(22-24). Detailed examination of these gene expression "sig- 
natures" can provide a dynamic view of the mode of action of 
a given signaling substance (22-24). Microarrays may thus 
allow rapid mechanistic examination of hormones, drugs, 
elicitors, and other small molecules; moreover, functional 
analysis of transcription factors, kinases, growth factors, cyto- 
kines, receptors, and other gene products should be possible. 
Efforts are underway to develop mRNA amplification strate- 
gies to enable probe preparation from minute tissue samples. 
This capability might allow for high-throughput patient screen- 
ing in a clinical setting. 

The current detection limit of the assay allows monitoring of 
transcripts that represent ~ 1:500,000 (wt/wt) of the total 
mRNA. This 10-fold increase in sensitivity compared with the 
original report (4) was achieved largely by modifying the 
coupling chemistry, which reduced background fluorescence. 
The significance of this improvement is considerable in that 
approximately half the human genes identified in this study, 
including all four novel sequences, exhibited expression levels 
below the original detection limit of 1:50,000 (4). 

The ability to detect 2-fold changes in expression was 
achieved by the use of two-color fluorescence in the labeling 
and detection schemes, digitized data collection, and custom 
software. The importance of this capability is underscored by 
the fact that nearly all of the genes examined here exhibited 
<6-fold changes in expression. The four novel genes, which
showed <2.2-fold activation, were probably overlooked in 
previous screens that used conventional differential expression 
techniques. It may be possible to further improve the precision 
of the microarray assay by the use of closely related fluorescent 
analogs, such as Cy3 and Cy5, in the labeling and hybridization 
reactions. 

Microarrays offer a number of advantages over other po- 
tential high-capacity approaches to expression analysis. The 
chip-based approach enables small hybridization volumes, high 
array densities, and the use of fluorescence labeling and 
detection schemes. These features provide a set of perfor- 
mance specifications that are unattainable with filter-based 
approaches (25, 26). The use of cDNA clones provides hybridization
specificity that is not readily attained with oligonucleotide
arrays (27-30). The parallel format of the assay
provides a simultaneous differential expression readout for
>1000 genes. This contrasts with sequencing-based methods,
which require serial data collection for expression analysis (31,
32). A commercial source of cDNA microarrays would greatly
speed the use of a chip-based approach to expression analysis. 

The availability of large numbers of ESTs (3) provides a rich
resource of human cDNA clones for microarraying. The
>400,000 ESTs in the public data bases represent a significant
subset of all human genes (3, 33). Microarrays of thousands of
ESTs will provide a powerful analytical tool for future human
gene expression studies. The ~100,000 genes in the human
genome (2, 33) emphasize the need for microarrays of greater
density. Attempts to improve microdeposition techniques are
underway and should allow construction of arrays containing
a complete set of human gene targets (http://cmgm.stanford.edu/~schena/).
Microarrays of ~100,000 cDNA elements
would allow expression monitoring of the entire human genome
in a single hybridization. This capacity, coupled with
detailed biochemical analysis of the individual gene products, 
would greatly speed the functional analysis of the human 
genome. 